Add capability boundary replay artifact by ProfRandom92 · Pull Request #152 · ProfRandom92/Comptextv7

ProfRandom92 · 2026-05-20T16:11:03Z

Motivation

Provide a deterministic, fixture-driven artifact that validates whether explicit capability-boundary commitments survive replay reconstruction, building on the MCP trace replay and graph-diff foundations.
Deliver deterministic evidence for capability-boundary node/edge survival without changing fixtures, README, workflows, validators, or adding external dependencies.

Description

Added a generator script, scripts/generate_capability_boundary_replay_artifact.py, that loads fixtures/manifest.json, reads the original/*.json and reconstructed/*.json payloads, conservatively extracts only explicit structured capability-boundary data from supported keys, normalizes edges and nodes, and compares the original and reconstructed graphs using normalize_edges, nodes_from_edges, and compare_edges from the graph core.
Committed the generated artifact, artifacts/capability_boundary_replay_results.json, which follows the stable schema (artifact_id, generated_by, version, evaluation_mode, llm_judges, external_apis, families, global_summary) with deterministic ordering and no timestamps or environment fields.
Added tests in tests/test_capability_boundary_replay_artifact.py that assert artifact existence, exact regeneration parity, top-level schema stability, determinism/sanitization, manifest alignment (family/fixture counts and IDs), capability-boundary evidence/drift behavior (including zero-data handling), and label discipline (only registered failure labels are used).
Scope note: deterministic capability-boundary replay artifact only; no fixture payload changes, no README or workflow changes, no runtime/orchestration behavior, no new failure labels, and no LLM/embedding/vector/fuzzy/network behavior.
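
As a rough illustration of the flow described above, the following minimal sketch extracts edges only from explicit structured fields and then compares the original and reconstructed edge sets. The key name `capability_boundaries`, the `source`/`target` edge shape, and the helpers `extract_edges` and `compare` are hypothetical stand-ins; the actual script relies on normalize_edges, nodes_from_edges, and compare_edges from the graph core, whose signatures are not shown in this PR.

```python
from typing import Any

# Hypothetical key; the real SUPPORTED_RELATION_KEYS are not listed in this PR.
SUPPORTED_KEYS = ("capability_boundaries",)

Edge = tuple[str, str]


def extract_edges(payload: dict[str, Any]) -> set[Edge]:
    """Collect edges only from explicit structured fields; prose is ignored."""
    edges: set[Edge] = set()
    for key in SUPPORTED_KEYS:
        value = payload.get(key)
        if not isinstance(value, list):
            continue  # conservative: skip anything not explicitly structured
        for item in value:
            if not isinstance(item, dict):
                continue
            source, target = item.get("source"), item.get("target")
            if isinstance(source, str) and isinstance(target, str):
                edges.add((source.strip(), target.strip()))
    return edges


def compare(original: set[Edge], replay: set[Edge]) -> dict[str, list[Edge]]:
    """Report which edges survived, were lost, or were added, in sorted order."""
    return {
        "preserved": sorted(original & replay),
        "missing": sorted(original - replay),
        "added": sorted(replay - original),
    }


original = {
    "capability_boundaries": [
        {"source": "agent", "target": "fs.read"},
        {"source": "agent", "target": "net.send"},
    ],
    "notes": "agent may also write files",  # prose only, never extracted
}
replay = {"capability_boundaries": [{"source": "agent", "target": "fs.read"}]}

result = compare(extract_edges(original), extract_edges(replay))
```

In this toy input the `net.send` edge is reported as missing, while the relation mentioned only in the `notes` prose never enters either graph, which mirrors the conservative extraction the scope note describes.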

Testing

Ran the generator (python scripts/generate_capability_boundary_replay_artifact.py), which produces artifacts/capability_boundary_replay_results.json; the output matches the committed artifact exactly.
Ran unit tests: pytest tests/test_capability_boundary_replay_artifact.py -q (all tests passed). Also ran the pytest suites this change relies on (tests/test_graph_diff_artifact.py, tests/test_replay_graph_core.py, tests/test_fixture_manifest.py, tests/test_failure_taxonomy.py) and the full test run under npm run check, which reported 262 passed.
Determinism checks: running the generator twice into temporary output paths produced identical content, and the artifact contains no timestamps, absolute paths, or environment/user fields.
Risks: extraction is intentionally conservative and uses only explicit structured capability-boundary fields, so relations expressed only in prose are excluded by design.
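
The determinism checks listed above could be sketched roughly as follows. This is not the PR's test code: `render_artifact`, `find_forbidden_keys`, and the `FORBIDDEN_KEYS` names are assumptions made for illustration, and the artifact content is a placeholder that borrows only a few top-level key names from the description.

```python
import json
import tempfile
from pathlib import Path
from typing import Any

# Hypothetical list of field names treated as non-deterministic.
FORBIDDEN_KEYS = {"timestamp", "generated_at", "hostname", "user", "cwd"}


def render_artifact(data: dict[str, Any]) -> str:
    """Serialize with sorted keys and fixed indentation so output is byte-stable."""
    return json.dumps(data, indent=2, sort_keys=True, ensure_ascii=True) + "\n"


def find_forbidden_keys(node: Any) -> set[str]:
    """Recursively collect any forbidden key names present in the artifact."""
    found: set[str] = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key in FORBIDDEN_KEYS:
                found.add(key)
            found |= find_forbidden_keys(value)
    elif isinstance(node, list):
        for item in node:
            found |= find_forbidden_keys(item)
    return found


# Placeholder content; only the top-level key names come from the PR description.
artifact = {"artifact_id": "example", "version": 1, "families": [], "global_summary": {}}

with tempfile.TemporaryDirectory() as tmp:
    first, second = Path(tmp) / "run1.json", Path(tmp) / "run2.json"
    first.write_text(render_artifact(artifact), encoding="utf-8")
    second.write_text(render_artifact(artifact), encoding="utf-8")
    identical = first.read_bytes() == second.read_bytes()

clean = not find_forbidden_keys(artifact)
```

Comparing bytes rather than parsed JSON is the stricter check, since it also catches differences in key order or whitespace that a structural comparison would hide.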

Codex Task

gemini-code-assist

Code Review

This pull request introduces a system for generating and testing deterministic capability-boundary replay artifacts. It includes a generation script that extracts boundary graphs from JSON payloads, a sample artifact file, and a comprehensive test suite. Review feedback focuses on optimizing memory efficiency by refactoring the graph extraction logic to use iterables and generator expressions, ensuring the script can handle large numbers of payloads without excessive memory consumption.

gemini-code-assist · 2026-05-20T16:14:13Z

+import sys
+from collections import defaultdict
+from pathlib import Path
+from typing import Any


Add Iterable to the imports to support memory-efficient processing of payloads using generators in the following functions.

Suggested change

-from typing import Any
+from typing import Any, Iterable

gemini-code-assist · 2026-05-20T16:14:13Z

+def _extract_boundary_graph(payloads: list[dict[str, Any]]) -> tuple[tuple[tuple[str, str], ...], tuple[str, ...]]:
+    edges: list[tuple[str, str]] = []
+    nodes: set[str] = set()
+
+    for relation_key in SUPPORTED_RELATION_KEYS:
+        for payload in payloads:
+            for relation_value in _collect_relation_values(payload, relation_key):
+                rel_edges, rel_nodes = _extract_relation_data(relation_value, relation_key)
+                edges.extend(rel_edges)
+                nodes.update(rel_nodes)


The current implementation of _extract_boundary_graph iterates over the payloads list for each key in SUPPORTED_RELATION_KEYS. By swapping the loops, you can process each payload fully in a single pass. This also allows the function to accept an Iterable instead of a list, which is necessary for memory-efficient processing of large numbers of payloads using generators.

Suggested change

-def _extract_boundary_graph(payloads: list[dict[str, Any]]) -> tuple[tuple[tuple[str, str], ...], tuple[str, ...]]:
-    edges: list[tuple[str, str]] = []
-    nodes: set[str] = set()
-
-    for relation_key in SUPPORTED_RELATION_KEYS:
-        for payload in payloads:
-            for relation_value in _collect_relation_values(payload, relation_key):
-                rel_edges, rel_nodes = _extract_relation_data(relation_value, relation_key)
-                edges.extend(rel_edges)
-                nodes.update(rel_nodes)
+def _extract_boundary_graph(payloads: Iterable[dict[str, Any]]) -> tuple[tuple[tuple[str, str], ...], tuple[str, ...]]:
+    edges: list[tuple[str, str]] = []
+    nodes: set[str] = set()
+
+    for payload in payloads:
+        for relation_key in SUPPORTED_RELATION_KEYS:
+            for relation_value in _collect_relation_values(payload, relation_key):
+                rel_edges, rel_nodes = _extract_relation_data(relation_value, relation_key)
+                edges.extend(rel_edges)
+                nodes.update(rel_nodes)
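
One reason the loop order matters for this suggestion: a Python generator can be consumed only once, so with the relation-key loop outermost, a generator input would be exhausted after the first key and later keys would silently see no payloads. The sketch below demonstrates this with simplified stand-ins; the keys `allows` and `denies`, the payload shape, and both helper functions are hypothetical and are not taken from the script.

```python
from typing import Any, Iterable, Iterator

KEYS = ("allows", "denies")  # hypothetical stand-ins for SUPPORTED_RELATION_KEYS


def keys_outer(payloads: Iterable[dict[str, Any]]) -> list[tuple[str, str]]:
    """Original loop order: payloads are re-iterated once per key."""
    found: list[tuple[str, str]] = []
    for key in KEYS:
        for payload in payloads:
            found.extend((key, value) for value in payload.get(key, []))
    return found


def payloads_outer(payloads: Iterable[dict[str, Any]]) -> list[tuple[str, str]]:
    """Suggested loop order: each payload is visited exactly once."""
    found: list[tuple[str, str]] = []
    for payload in payloads:
        for key in KEYS:
            found.extend((key, value) for value in payload.get(key, []))
    return found


def stream() -> Iterator[dict[str, Any]]:
    """A one-shot generator, standing in for lazily loaded JSON payloads."""
    yield {"allows": ["read"], "denies": ["write"]}
    yield {"denies": ["exec"]}


lost = keys_outer(stream())       # generator is exhausted after the first key
kept = payloads_outer(stream())   # single pass sees every relation
```

Swapping the loops can also change the order in which edges are collected, so the two versions produce the same artifact only if the downstream normalization is order-insensitive (for example, if it sorts). Whether normalize_edges does so is not shown in this PR.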

gemini-code-assist · 2026-05-20T16:14:13Z

+            original_payloads = [_load_json(path) for path in _discover_payload_files(fixture_root / "original")]
+            replay_payloads = [_load_json(path) for path in _discover_payload_files(fixture_root / "reconstructed")]
+
+            original_edges, original_nodes = _extract_boundary_graph(original_payloads)
+            replay_edges, replay_nodes = _extract_boundary_graph(replay_payloads)


Loading all JSON payloads into memory at once using list comprehensions can be very memory-intensive as the number and size of fixtures grow. Using generator expressions combined with the updated _extract_boundary_graph (accepting an Iterable) significantly reduces the memory footprint by processing files one at a time.

Suggested change

-            original_payloads = [_load_json(path) for path in _discover_payload_files(fixture_root / "original")]
-            replay_payloads = [_load_json(path) for path in _discover_payload_files(fixture_root / "reconstructed")]
-
-            original_edges, original_nodes = _extract_boundary_graph(original_payloads)
-            replay_edges, replay_nodes = _extract_boundary_graph(replay_payloads)
+            original_edges, original_nodes = _extract_boundary_graph(
+                _load_json(path) for path in _discover_payload_files(fixture_root / "original")
+            )
+            replay_edges, replay_nodes = _extract_boundary_graph(
+                _load_json(path) for path in _discover_payload_files(fixture_root / "reconstructed")
+            )
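
To illustrate the lazy-loading behavior this suggestion relies on, the sketch below records when each file is actually read. It assumes a hypothetical `load_json` helper standing in for the script's _load_json and uses throwaway files in a temporary directory instead of real fixtures.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Iterable

loaded: list[str] = []  # records the order in which files are actually read


def load_json(path: Path) -> dict[str, Any]:
    """Hypothetical stand-in for the script's _load_json helper."""
    loaded.append(path.name)
    return json.loads(path.read_text(encoding="utf-8"))


def count_payloads(payloads: Iterable[dict[str, Any]]) -> int:
    """Consume payloads one at a time, as a single-pass extractor would."""
    return sum(1 for _ in payloads)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for index in range(3):
        (root / f"{index}.json").write_text(json.dumps({"id": index}), encoding="utf-8")

    # The generator expression defers every load_json call until iteration.
    lazy = (load_json(path) for path in sorted(root.glob("*.json")))
    reads_before = len(loaded)   # nothing has been read yet
    total = count_payloads(lazy)
    reads_after = len(loaded)    # each file read exactly once, on demand
```

Because each payload is parsed only when the consumer asks for it and can be released afterwards, peak memory is bounded by the largest single payload plus the accumulated edges and nodes, rather than by the sum of all payloads.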

Add capability boundary replay artifact

1d990a0

ProfRandom92 added the codex label May 20, 2026 — with ChatGPT Codex Connector

gemini-code-assist Bot reviewed May 20, 2026

ProfRandom92 merged commit 5213a8c into main May 20, 2026
4 checks passed

ProfRandom92 merged 1 commit into main from codex/add-deterministic-capability-boundary-replay-artifact
